Correlation and Sampling in Relational Data Mining
Authors
Abstract
Data mining in relational data poses unique opportunities and challenges. In particular, relational autocorrelation provides an opportunity to increase the predictive power of statistical models, but it can also mislead investigators using traditional sampling approaches to evaluate data mining algorithms. We investigate the problem and provide new sampling approaches that correct the bias associated with traditional sampling.
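To make the sampling concern concrete, here is a minimal sketch of one way to keep training and test sets independent when objects share linked entities: whole groups are assigned to one side of the split, so no linked entity appears in both. This is an illustration under stated assumptions, not necessarily the procedure the authors propose. The function name `split_by_group`, the `group_of` mapping, and the movie/studio identifiers are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_group(group_of, test_fraction=0.3, seed=0):
    """Split objects so that no linked group spans both train and test.

    group_of: dict mapping each object id to the id of the entity it is
    linked to (hypothetical example: movie -> studio).
    """
    # Collect the objects that belong to each linked entity.
    groups = defaultdict(list)
    for obj, grp in group_of.items():
        groups[grp].append(obj)

    # Shuffle group ids (not individual objects) with a fixed seed.
    group_ids = sorted(groups)
    rng = random.Random(seed)
    rng.shuffle(group_ids)

    # Assign whole groups to the test set, at least one group.
    n_test = max(1, round(len(group_ids) * test_fraction))
    test_groups = group_ids[:n_test]
    train_groups = group_ids[n_test:]

    train = [obj for g in train_groups for obj in groups[g]]
    test = [obj for g in test_groups for obj in groups[g]]
    return train, test


# Hypothetical data: six movies linked to three studios.
group_of = {
    "m1": "s1", "m2": "s1",
    "m3": "s2", "m4": "s2",
    "m5": "s3", "m6": "s3",
}
train, test = split_by_group(group_of, test_fraction=0.3, seed=0)
train_studios = {group_of[m] for m in train}
test_studios = {group_of[m] for m in test}
print(train_studios & test_studios)  # empty set: no studio is shared
```

By contrast, a split that samples individual objects at random can place movies from the same studio on both sides, which is the kind of dependence between training and test data that the abstract warns can mislead evaluation.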
Similar resources
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that appear frequently in the data set and mark them as frequent patterns so that users can make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...
CoDS: A Representative Sampling Method for Relational Databases
Database sampling has become a popular approach to handle large amounts of data in a wide range of application areas such as data mining or approximate query evaluation. Using database samples is a potential solution when using the entire database is not cost-effective, and a balance between the accuracy of the results and the computational cost of the process applied on the large data set is p...
Relational Aggression in Preschool Children
Objectives: This study aimed to investigate relational aggression in preschool children in Shiraz, because it has harmful consequences for both the aggressive child and other children. Method: In a descriptive cross-sectional survey, 258 children (119 boys, 139 girls) aged 3 to 7 years completed a 10-item questionnaire in the field of relational aggression for preschool children-teache...
Analyzing Correlation between Internationalization Orientation and Social Network
Research on social networks and collaborative strategies has been prominent since the mid-1980s and has contributed to the success and development of firms. Relationships and communication with overseas trade partners help firms succeed in entering foreign markets and in finding new partners and new markets abroad. Firm internationalization in foreign countries faces some ba...
A Resampling Technique for Relational Data Graphs
Resampling (a.k.a. bootstrapping) is a computationally intensive statistical technique for estimating the sampling distribution of an estimator. Resampling is used in many machine learning algorithms, including ensemble methods, active learning, and feature selection. Resampling techniques generate pseudosamples from an underlying population by sampling with replacement from a single sample data...